Estimating Acoustic Properties Accurately and Reliably in Samples of Conversational Speech for Clinical Applications
Abstract
Samples of everyday conversations are being collected and analyzed in a growing number of applications, ranging from the study of behavior in social psychology to the clinical assessment of voice pathology and even cognitive function. Aside from the spoken words, the acoustic properties of speech samples can provide important cues in these applications. The goal of this study is to develop robust and accurate algorithms for estimating speech features. Researchers have employed a number of time-domain and frequency-domain techniques to estimate, for example, fundamental frequency and harmonic-to-noise ratio (HNR). However, the limitations of these techniques hinder their use in clinical assessment. Time-domain methods often ignore the frequency and amplitude variations of speech over the analysis frame, while the short-time Fourier transform does not provide the time-frequency resolution needed to capture the small perturbations observed in, for example, Parkinson’s disease (PD). The purpose of this study is to achieve accurate and reliable estimation of fundamental frequency, HNR, jitter, and shimmer for clinical speech analysis. Adopting a time-varying harmonic model (TVHM) to represent speech, we quantify hoarseness, a salient feature of PD, as well as jitter and shimmer. We verify our implementation of TVHM and pitch estimation on the Keele data set. Results show that pitch estimates obtained using TVHM outperform those from get-f0, an algorithm employed in many popular tools (WaveSurfer, Praat, etc.). Further, we demonstrate the utility of our measures of hoarseness, jitter, and shimmer in predicting clinical ratings of the severity of Parkinson’s disease.
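For readers unfamiliar with the perturbation measures named in the abstract, the following minimal sketch shows the conventional cycle-to-cycle definitions of local jitter and local shimmer. It is not the TVHM-based estimator the abstract describes. The function names and the sample values are hypothetical, and the per-cycle periods and peak amplitudes are assumed to have been extracted already by some pitch tracker.

```python
from statistics import mean


def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by their mean.

    This is the conventional "local" perturbation measure; it is an
    illustrative stand-in, not the TVHM-based estimator of the abstract.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def local_jitter(periods_s):
    """Cycle-to-cycle variation of the glottal period (periods in seconds)."""
    return relative_perturbation(periods_s)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation of the per-cycle peak amplitude."""
    return relative_perturbation(peak_amplitudes)


if __name__ == "__main__":
    # Hypothetical per-cycle measurements, for illustration only.
    periods = [0.0100, 0.0102, 0.0099, 0.0101]
    amplitudes = [0.80, 0.78, 0.82, 0.79]
    print(f"jitter:  {100 * local_jitter(periods):.2f} %")
    print(f"shimmer: {100 * local_shimmer(amplitudes):.2f} %")
```

Both measures are dimensionless ratios, which is why they are usually reported as percentages. Because they depend on cycle boundaries, any error in the underlying pitch estimate propagates directly into them, which is one reason accurate fundamental frequency estimation matters for clinical use.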
Similar resources
An Introduction to Speech Sciences (Acoustic Analysis of Speech)
Speech sciences deal with the acoustic characteristics of speech by means of sophisticated software and hardware. Although speech science is well established in developed countries, especially in Western societies, it has remained almost unknown in Iran, though in recent years a group of scholars has become involved in this branch of science. The applicati...
The Function of Pitch Range Variations in Samples of Emotional Expressions in Persian
This study investigates the interface between emotion and intonation patterns (more specifically, duration and pitch amplitude of speech). To this end, the acoustic properties of spectral parameters related to speech prosody are investigated. The results of acoustic and statistical analysis show that the mean level and range of F0 in the contours vary strongly as a function of the degree o...
Measuring the intelligibility of conversational speech in children.
Conversational speech is the most socially-valid context for evaluating speech intelligibility, but it is not routinely examined. This may be because it is difficult to reliably count the number of words in the unintelligible portions of the sample. In this study four different approaches to dealing with this problem are examined. Each is based on the assumption that it is possible to perceive ...
An Acoustic Study of Emotivity-Prosody Interface in Persian Speech Using the Tilt Model
This paper aims to explore some acoustic properties (i.e. duration and pitch amplitude of speech) associated with three different emotions: anger, sadness and joy against neutrality as a reference point, all being intentionally expressed by six Persian speakers. The primary purpose of this study is to find out if there is any correspondence between the given emotions and prosody patterning in P...
Use of higher level linguistic structure in acoustic modeling for speech recognition
Current speech recognition systems perform poorly on conversational speech as compared to read speech, largely because of the additional acoustic variability observed in conversational speech. Our hypothesis is that there are systematic effects, related to higher level structures, that are not being captured in the current acoustic models. In this paper we describe a method to extend standard c...
Publication date: 2011